Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[api-extractor] Extract source file locations for relevant API items #3590

Merged
merged 11 commits into from
Oct 11, 2022

Conversation

zelliott
Copy link
Contributor

@zelliott zelliott commented Aug 18, 2022

Summary

Fixes #895.

This PR adds a new sourceFile property to certain API items that tracks the source file where that item was declared. This allows an API reference site to include links back to the original source code. It largely follows the design agreed upon by @octogonz and @ecraig12345.

Details

ApiDeclaredItem.sourceFile

  • A new sourceFile property was added to the ApiDeclaredItem class as opposed to some other API item class or mixin because it only makes sense for declared items (items with source code excerpts).
  • We first try to serialize the path to the original .ts source file. If we can't resolve the original .ts source file for some reason, then we fall back to the .d.ts file. If this also fails for some reason (should be impossible), then the property is undefined.
  • The path is relative to the <projectFolder> token as defined in the api-extractor.json config file.

SourceMapper changes

MessageRouter uses SourceMapper today to resolve mappings between .d.ts and .ts files. I made a series of changes to the SourceMapper to make it more generic such that now both MessageRouter and ApiModelGenerator can use it.

Tracking source file position

#895 discussed tracking other information in additional to the source file (e.g. start/end position, line number, column number, buffer offset, etc). This PR only tracks the source file because that's all that was requested in the original issue (#895 (comment)) and that's sufficient for my use case.

Scenarios when sourceFile points to a .d.ts file

There are only two cases in the api-extractor-scenarios tests where the sourceFile property for an API item points to a .d.ts file:

  1. The API item is in a separate package, but bundledPackages is turned on. We don't have access to the original .ts files, so .d.ts is the best we can do.
  2. There are simply no .ts source files due to the nature of the test. In the excerptTokens test, there are no .ts source files (only .d.ts that were manually written), and thus the sourceFile property for API items point to .d.ts files.

AstNamespaceImport items

AstNamespaceImport items are special because they aren't declared as namespaces in the original source files with the namespace keyword. Instead, they are "synthetically" created by API Extractor to represent import * as ns from './file' statements.

They are represented in the .api.json as AstNamespace items, and their sourceFile points to the file where the import * as ns from './file' statement is. I considered the sourceFile pointing to the './file' file instead, but I thought this would be more confusing for users of an API reference site.

API items spanning multiple files

Today, it's not possible for declarations across multiple .ts files to be represented by a single API item. But this may be possible in the future. In this case, which source file do we track?

I don't think it's worth trying to design for this right now, largely because the .api.json as-is (before this feature) is already fundamentally incompatible with API items spanning multiple files (and even multiple declarations). For example, two namespaces with the same name in the same file are represented today as one AstNamespace item. However, the docComment and excerptToken fields are from the first namespace, the second is ignored (even though the AstNamespace still has the members from both namespaces). This feels like an pre-existing limitation in the .api.json that this feature doesn't make more complicated to solve.

If we did want to solve it, we'd probably have to change the API item hierarchy such that a single API item can have multiple "declarations". The ApiDeclaredItem class could be updated to have a list of docComment, excerptTokens, and sourceFile fields, probably bundled together in some new intermediary interface (so that you know which doc comment is associated with which excerpt tokens and source file). But this would be a big breaking change that is probably smart to land separately.

JSON verbosity & stability

This feature increases the size of the .api.json by adding a new field to essentially all API items and decreases the stability of the .api.json (i.e. an API item moved between two local files will cause an .api.json change).

How it was tested

Ran rush rebuild and inspected all of the .api.json files.

@@ -7,6 +7,7 @@
import { DeclarationReference } from '@microsoft/tsdoc/lib-commonjs/beta/DeclarationReference';
import { DocDeclarationReference } from '@microsoft/tsdoc';
import { IJsonFileSaveOptions } from '@rushstack/node-core-library';
import { JsonObject } from '@rushstack/node-core-library';
Copy link
Contributor Author

@zelliott zelliott Aug 30, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why this is appearing... 🤔. Perhaps an API Extractor bug.

return undefined;
}

return path.join(this._projectFolderUrl, this._fileUrlPath);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that path.join doesn't quite work here, it shouldn't be used to construct URLs. For example, it will turn "http://something" into "http:/something".

But simple concatenation doesn't work in all cases either, because this._fileUrlPath can include "../" segments (see the api-extractor-scenarios bundledPackages test scenario).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would the URL constructor work here?

It should solve the outlined issue, e.g. new URL("../path", "http://something/abc").href => http://something.com/path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it quite solves the issue above. new URL("path", "http://something.com/abc").href => "http://something.com/path", but we'd want "http://something.com/abc/path".

this. _fileUrlPath is always relative to the <projectFolder>. Thus, it should only include "../" segments if it's pointing to a file outside of <projectFolder>. I'm wondering if maybe this can only happen if it's pointing to a .d.ts file - as in the case of the bundledPackages test scenario. If that's the case, then perhaps SourceLocation.fileUrl can return undefined for .d.ts files, as I don't believe they're typically checked into source control, so a constructed URL wouldn't be valid anyway? Then we can just use simple concatenation here.

@octogonz - what do you think?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if it quite solves the issue above. new URL("path", "http://something.com/abc").href => "http://something.com/path", but we'd want "http://something.com/abc/path".

If you include a trailing slash in the base, it will append the path as you'd want it.
e.g. new URL("path", "http://something.com/abc/").href => "http://something.com/abc/path"

Sorry if I'm wasting your time with this, I'm not very familiar with the internals of this project 😓

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh nice! It seems like new URL could work well in this case then. And you're not wasting my time at all!

Still not sure if we want to handle paths to .d.ts files separately, but In the meantime I can update this to use new URL instead, as it's strictly better than path.join.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Keep in mind that new URL("Äpfel", "http://something.com/abc/").href prints http://something.com/abc/%C3%84pfel )

// "includeForgottenExports": false,

/**
* The URL to the `<projectFolder>` token where the project's source code can be viewed on a website like GitHub or
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this relate to the <projectFolder> token? The serialized paths are relative to the <projectFolder> value, but otherwise the projectFolderUrl is a URL not a disk path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment mentions the <projectFolder> token because this is the URL to the projectFolder folder on a source code viewing website like GitHub. Might this be less confusing if I use projectFolder instead of <projectFolder>? Maybe that's where the confusion is coming from?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated this comment.

pos: declaration.pos
});

return path.relative(this._collector.extractorConfig.projectFolder, sourceLocation.sourceFilePath);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zelliott Here we are computing a filesystem path, but then returning it as a "URL path". For example, if the source file is called "src/@bad+folder/Example.ts" then at some point it may need to get encoded into something like src/%40bad%2Bfolder/Example.ts to avoid trouble in the web server.

How we encode it (URI? IRI?) perhaps depends on the particular target website. So I think it maybe makes sense to preserve a "raw" URL-like path here, and document clearly in the API model that the consumer is responsible for URL encoding. And we could demonstrate this by performing such encoding appropriately in API Documenter, at the step where it emits its Markdown hyperlinks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows OS, path.relative() will in fact include backslashes instead of slashes, which needs to be addressed here, though.

Copy link
Contributor Author

@zelliott zelliott Oct 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think it maybe makes sense to preserve a "raw" URL-like path here, and document clearly in the API model that the consumer is responsible for URL encoding.

If I understand you correctly, you're suggesting to just update the SourceLocation.fileUrl doc comment to clarify that URL encoding is a responsibility of the consumer?

I'm wondering if URL construction altogether should be a responsibility of the consumer. The construction we're currently performing feels a little fragile, and I'm not sure if it's possible to improve it. See the discussion at #3590 (comment).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use path.posix.relative instead.

@octogonz octogonz merged commit 2f8d789 into microsoft:main Oct 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[api-extractor] Optionally include source code coordinates in generated documentation
3 participants